Syllable-length path mixture hidden Markov models with trajectory clustering for continuous speech recognition
نویسندگان
چکیده
Recent research suggests that modeling coarticulation in speech is more appropriate at the syllable level. However, due to a number of additional factors that can affect the way syllables are articulated, creating multiple acoustic models per syllable might be necessary. Our previous research on longer-length multi-path models has proved that data-driven trajectory clustering to be an attractive approach to derive multi-path models. However, the use of single distribution with unvarying covariance to model a trajectory cluster may degrade its capability of detecting pronunciation variants. In this paper, we propose a new method, namely path mixture hidden Markov model, to alleviate the adverse effects of trajectory clustering. The improvement on performance observed in continuous speech recognition experiments show path mixture model is a very effective approach.
منابع مشابه
Microsoft Word - Hybridmodel2.dot
Today’s state-of-the-art speech recognition systems typically use continuous density hidden Markov models with mixture of Gaussian distributions. Such speech recognition systems have problems; they require too much memory to run, and are too slow for large vocabulary applications. Two approaches are proposed for the design of compact acoustic models, namely, subspace distribution clustering hid...
متن کاملWhither Linguistic Interpretation of Acoustic Pronunciation Variation
Recent research suggests that modelling pronunciation variation is more appropriate at the syllable level than at the level of contextdependent phones. Due to the large number of factors affecting syllable pronunciation, the creation of multi-path topologies is nec essary. Previous research on multi-path models in connected digit recognition has proved trajectory clustering to be an attractive...
متن کاملSpeaker Independent Speech Recognition Using Hidden Markov Models for Persian Isolated Words
متن کامل
Speaker Independent Speech Recognition Using Hidden Markov Models for Persian Isolated Words
متن کامل
Dual stream speech recognition using articulatory syllable models
Recent theoretical developments in neuroscience suggest that sublexical speech processing occurs via two parallel processing pathways. According to this Dual Stream Model of Speech Processing speech is processed both as sequences of speech sounds and articulations. We attempt to revise the “beads-on-a-string” paradigm of Hidden Markov Models in Automatic Speech Recognition (ASR) by implementing...
متن کامل